Abstract

Taking inspiration from the hypothesis of muscle synergies, we pro- 
pose a method to generate open loop controllers for an agent solving 
point-to-point reaching tasks. The controller output is defined as a lin- 
ear combination of a small set of predefined actuations, termed synergies. 
The method can be interpreted from a developmental perspective, since 
it allows the agent to autonomously synthesize and adapt an effective set 
of synergies to new behavioral needs. This scheme greatly reduces the 
dimensionality of the control problem, while keeping a good performance 
level. The framework is evaluated in a planar kinematic chain, and the 
quality of the solutions is quantified in several scenarios. 

1 Introduction 

Humans are able to perform a wide variety of tasks with great flexibility; learning
new motions is relatively easy, and adapting to new situations (e.g. changes
in the environment or body growth) usually requires no particular effort.
The strategies adopted by the central nervous system (CNS) to master the com- 
plexity of the musculoskeletal apparatus and provide such performance are still 
not clear. However, it has been speculated that an underlying modular orga- 
nization of the CNS may simplify the control and provide the observed adapt- 
ability. There is evidence that the muscle activity necessary to perform various 
tasks (e.g. running, walking, keeping balance, reaching and other combined 
movements) may emerge from the combination of predefined muscle patterns, 
the so-called muscle synergies [1]. This organization seems to explain muscle 
activity across a wide range of combined movements [2-4]. 

The scheme of muscle synergies is inherently flexible and adaptable. Differ- 
ent actions are encoded by specific combinations of a small number of predefined 
synergies; this reduces the computational effort and the time required to learn 
new useful behaviors. The learning scheme can be regarded as developmental 
since information previously acquired (i.e. synergies) can be reused to generate
new behaviors [5]. Finally, improved performance can be easily achieved
by introducing additional synergies. Thus, the hypothetical scheme of muscle 
synergies would contribute to the autonomy and the flexibility observed in bi- 
ological systems, and it could inspire new methods to endow artificial agents 
with such desirable features. 

In this paper we propose a method to control a dynamical system (i.e. the 
agent) in point-to-point reaching tasks by linear combinations of a small set of 
predefined actuations (i.e. synergies). Our method initially solves the task in 
state variables by interpolation; then, it identifies the combination of synergies 
(i.e. actuation) that generates the kinematic trajectory closest to the computed
interpolant. Additionally, we propose a strategy to synthesize a small set of 
synergies that is tailored to the task and the agent. The overall method can be 
interpreted in a developmental fashion; i.e. it allows the agent to autonomously 
synthesize and update its own synergies to improve its performance on new
reaching tasks.

Other researchers in robotics and control engineering have recently proposed 
architectures inspired by the concept of muscle synergies. In [6] the authors de- 
rive an analytical form of a set of primitives that can drive a feedback linearized 
system (known analytically) to any point of its configuration space. In [7] the 
authors present a numerical method to identify synergies that optimally drive 
the system over a set of desired trajectories. This method does not require an 
analytical description of the system, and it has the advantage of assessing the 
quality of the synergies in task space. However, it is computationally expen- 
sive as it involves heavy optimizations. In [8] muscle synergies are identified by 
applying an unsupervised learning procedure to a collection of sensory-motor 
data obtained by actuating a robot with random signals. In [9] the architec- 
ture of the dynamic movement primitives (DMP) is proposed as a novel tool to 
formalize control policies in terms of predefined differential equations. Linear 
combinations of Gaussian functions are used as inputs to modify the attractor 
landscapes of these equations, and to obtain the desired control policy. 

In contrast to these works, our method to synthesize synergies does not 
rely on feedback linearization, nor on repeated integrations of the dynamical 
system. The method is grounded on the input-output relation of the dynamical 
system (as in [8]), and it provides a computationally fast method to obtain 
the synergy combinators to solve a given task. Furthermore, our method is 
inherently adaptable as it allows the on-line modification of the set of synergies
to accommodate new reaching tasks.

2 Definitions and Methods 

In this section we introduce the mathematical details of the method we propose. 
After some definitions, we present the core element of our method: a general 
procedure to compute actuations that solve point-to-point reaching tasks (see 
Sec. 2.1). Subsequently, in Section 2.2, we propose a framework for the synthesis 
and the development of a set of synergies. 

Let us consider a differential equation modeling a physical system
$\mathcal{D}(q(t)) = u(t)$, where $q(t)$ represents the time-evolution of its configuration
variables (their derivatives with respect to time are $\dot{q}(t)$), and $u(t)$ is the
actuation applied. Inspired by the hypothesis of muscle synergies¹ [1], we formulate
the actuation as a linear combination of predefined motor co-activation patterns:

$$ u(t) = \sum_{i=1}^{N_\phi} \phi_i(t)\, b_i := \Phi(t)\, b, \tag{1} $$

where the functions $\phi_i(t) \in \Phi$ are called motor synergies. The notation $\Phi(t)$
describes a formal matrix where each column is a different synergy. If we consider
a time discretization, $\Phi(t)$ becomes an $N \dim(q)$-by-$N_\phi$ matrix, where $N$ is
the number of time steps, $\dim(q)$ the dimension of the configuration space and
$N_\phi$ the number of synergies.
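
As an illustration of the discretized form of Eq. (1), the following NumPy sketch stacks sampled synergies as columns of a matrix and evaluates the actuation as a matrix-vector product. The helper names (`synergy_matrix`, `actuation`), the sinusoidal synergy shapes and the ordering of the flattened time samples are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def synergy_matrix(synergies):
    """Stack sampled synergies, each of shape (N, dim_q), as columns of Phi."""
    # Flattening turns each synergy into a vector of length N * dim_q.
    return np.column_stack([np.asarray(s, dtype=float).reshape(-1) for s in synergies])

def actuation(Phi, b, n_steps, dim_q):
    """Evaluate u = Phi b and reshape the result back to (N, dim_q)."""
    return (Phi @ np.asarray(b, dtype=float)).reshape(n_steps, dim_q)

N, DIM_Q = 50, 2
t = np.linspace(0.0, 1.0, N)
# Two hypothetical synergies for a 2-joint system (illustrative shapes only).
phi_1 = np.stack([np.sin(np.pi * t), np.zeros(N)], axis=1)
phi_2 = np.stack([np.zeros(N), np.sin(2.0 * np.pi * t)], axis=1)

Phi = synergy_matrix([phi_1, phi_2])            # shape (N * DIM_Q, 2)
u = actuation(Phi, [0.5, -1.0], N, DIM_Q)       # shape (N, DIM_Q)
```

With this layout the controller is fully described by the two entries of the combinator vector, regardless of the number of time steps.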

We define dynamic responses (DR) of the set of synergies as the responses
$\theta_i(t) \in \Theta$ of the system to each synergy (i.e. forward dynamics):

$$ \mathcal{D}(\theta_i(t)) = \phi_i(t), \quad i = 1, \ldots, N_\phi, \tag{2} $$

with initial conditions chosen arbitrarily.
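
To make the notion of dynamic response concrete, the sketch below integrates a toy one-degree-of-freedom system under each synergy and collects the resulting trajectories as columns. The system (a damped pendulum-like equation), its parameters, the semi-implicit Euler integrator and the function names are all assumptions chosen for brevity; the arm model used in the paper is different.

```python
import numpy as np

def forward_dynamics(phi, dt, damping=0.5, stiffness=2.0, q0=0.0, dq0=0.0):
    """Integrate q'' + damping*q' + stiffness*sin(q) = phi(t) for a toy 1-DoF system."""
    q, dq = q0, dq0
    theta = np.empty(len(phi))
    for k, u_k in enumerate(phi):
        theta[k] = q                               # store the state before the step
        ddq = u_k - damping * dq - stiffness * np.sin(q)
        dq += dt * ddq                             # semi-implicit Euler: velocity first,
        q += dt * dq                               # then position with the new velocity
    return theta

def dynamic_responses(Phi, dt):
    """Column i of Theta is the response of the toy system to column i of Phi."""
    return np.column_stack([forward_dynamics(Phi[:, i], dt) for i in range(Phi.shape[1])])

DT, N = 0.01, 100
t = DT * np.arange(N)
# Hypothetical synergy set: no actuation, a half sine, and a constant push.
Phi = np.column_stack([np.zeros(N), np.sin(np.pi * t), np.ones(N)])
Theta = dynamic_responses(Phi, DT)
```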



2.1 Solution to point-to-point reaching tasks 

A general point-to-point reaching task consists in reaching a final state $(q_T, \dot{q}_T)$
from an initial state $(q_0, \dot{q}_0)$ in a given amount of time $T$:

$$ q(0) = q_0, \quad \dot{q}(0) = \dot{q}_0, \qquad q(T) = q_T, \quad \dot{q}(T) = \dot{q}_T. \tag{3} $$

Controlling a system to perform such tasks amounts to finding the actuation $u(t)$
that fulfills the point constraints² (3). Specifically, assuming that the synergies
are known, the goal is to identify the appropriate synergy combinators $b$. In this
paper we consider only the subclass of reaching tasks that impose motionless
initial and final postures, i.e. $\dot{q}_T = \dot{q}_0 = 0$.

The procedure consists of, first, solving the problem in kinematic space (i.e.
finding the appropriate $q(t)$), and then computing the corresponding actuations.
From the kinematic point of view, the task can be seen as an interpolation
problem; i.e. $q(t)$ is a function that interpolates the data in (3). Therefore, a set
of functions is used to build the interpolant trajectory that satisfies the constraints
imposed by the task; these functions are herein the dynamic responses of the
synergies:

$$ q(t) = \sum_{i=1}^{N_\theta} \theta_i(t)\, a_i := \Theta(t)\, a, \tag{4} $$

where the vector of combinators $a$ is chosen such that the task is solved. As
mentioned earlier, if time is discretized, $\Theta(t)$ becomes an $N \dim(q)$-by-$N_\theta$
matrix, where $N_\theta$ is the number of dynamic responses. The quality of the DRs as
interpolants is evaluated in Section 3.
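
One possible way to choose the combinators in Eq. (4) is to impose the boundary data of the task as linear constraints on the sampled dynamic responses and solve them in the least-squares sense. The sketch below does this for a scalar configuration variable. The paper does not state how the combinators are computed, so the use of `numpy.linalg.lstsq`, the finite-difference velocities and the monomial stand-ins for the dynamic responses are assumptions of this example.

```python
import numpy as np

def kinematic_combinators(Theta, dt, q0, qT):
    """Least-squares a such that q = Theta a meets the boundary data of the task."""
    dTheta = np.gradient(Theta, dt, axis=0)        # finite-difference velocities
    # Rows: position and velocity of every DR at t = 0 and at t = T.
    C = np.vstack([Theta[0], dTheta[0], Theta[-1], dTheta[-1]])
    d = np.array([q0, 0.0, qT, 0.0])               # motionless start and end
    a, *_ = np.linalg.lstsq(C, d, rcond=None)      # minimum-norm solution
    return a

DT, N = 0.01, 101
t = DT * np.arange(N)
# Stand-in DRs: monomials 1, t, ..., t^5 (hypothetical, not simulated responses).
Theta = np.column_stack([t**k for k in range(6)])
a = kinematic_combinators(Theta, DT, q0=0.2, qT=1.0)
q = Theta @ a                                      # interpolant trajectory
dq = np.gradient(q, DT)                            # its finite-difference velocity
```

With more responses than constraints the system is underdetermined, and the least-squares routine returns the combinator vector of smallest norm among the exact solutions.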

Once a kinematic solution has been found (as a linear combination of DRs), the
corresponding actuation can be obtained by applying the differential operator,
i.e. $\mathcal{D}(\Theta(t)a) = u(t)$. Finally, the vector $b$ can be computed by projecting $u(t)$
onto the synergy set $\Phi$. If $u(t)$ does not belong to the linear span of $\Phi$, the
solution can only be approximated in terms of a defined norm (e.g. Euclidean):

$$ b = \arg\min_{b} \| u(t) - \Phi(t)\, b \|. \tag{5} $$

When time is discretized, all functions of time become vectors and this
equation can be solved explicitly using the pseudoinverse of the matrix $\Phi$,

$$ \Phi^{+} u = \Phi^{+} \mathcal{D}(\Theta a) = b. \tag{6} $$

This equation highlights the operator $\Phi^{+} \circ \mathcal{D} \circ \Theta$ ($\circ$ denotes operator
composition) as the mapping between the kinematic combinators $a$ (kinematic solution)
and the synergy combinators $b$ (dynamic solution). Generically, this operator
represents a nonlinear mapping $\mathcal{M} : \mathbb{R}^{N_\theta} \to \mathbb{R}^{N_\phi}$, and it will be discussed in
Section 4.

¹ With respect to the model of time-varying synergies, in this paper we neglect the synergy
onset times.

² In this paper we assume that the initial conditions of the system are equal to $(q_0, \dot{q}_0)$.
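
The chain of operations in Eq. (6) can be sketched numerically as follows: apply a discretized differential operator to the kinematic solution, then project the resulting actuation onto the synergies with the pseudoinverse. The toy operator, its parameters, the polynomial trajectories and the function names are assumptions of this example and stand in for the arm dynamics used in the paper.

```python
import numpy as np

def inverse_dynamics(q, dt, damping=0.5, stiffness=2.0):
    """Toy operator D(q) = q'' + damping*q' + stiffness*sin(q), by finite differences."""
    dq = np.gradient(q, dt)
    ddq = np.gradient(dq, dt)
    return ddq + damping * dq + stiffness * np.sin(q)

def synergy_combinators(Phi, Theta, a, dt):
    """b = Phi^+ D(Theta a); also return the norm of the projection residual."""
    u = inverse_dynamics(Theta @ a, dt)            # actuation of the kinematic solution
    b = np.linalg.pinv(Phi) @ u                    # least-squares projection onto span(Phi)
    return b, np.linalg.norm(u - Phi @ b)

DT, N = 0.01, 101
t = DT * np.arange(N)
# Two hypothetical DRs and the synergies obtained from them through the toy operator.
Theta = np.column_stack([t**2 * (3.0 - 2.0 * t), t**2])
Phi = np.column_stack([inverse_dynamics(Theta[:, i], DT) for i in range(2)])

b_single, res_single = synergy_combinators(Phi, Theta, np.array([1.0, 0.0]), DT)
b_mixed, res_mixed = synergy_combinators(Phi, Theta, np.array([0.5, 0.5]), DT)
```

When the kinematic solution is a single dynamic response, the actuation coincides with the matching synergy and the projection is exact up to rounding. For mixed combinators the sine term of the toy operator generally leaves a residual, which is one way to see why the map from $a$ to $b$ is nonlinear.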

To assess the quality of the solution we define the following measures:

Interpolation error: Measures the quality of the interpolant $\Theta(t)a$ with respect
to the task. Strictly speaking, only the case of negligible errors corresponds
to interpolation. A non-zero error indicates that the trajectory $\Theta(t)a$ only
approximates the task,

$$ \mathrm{err}_I = \sqrt{\| q_T - \Theta(T)a \|^2 + \| \dot{\Theta}(T)a \|^2}, \tag{7} $$

where $\|\cdot\|$ denotes the Euclidean norm, and the differences between angles are
mapped to the interval $(-\pi, \pi]$.

Projection error: Measures the distance between the actuation that solves the
task $u(t)$, and the linear span of the synergy set,

$$ \mathrm{err}_P = \min_{b} \| u(t) - \Phi(t)\, b \|. \tag{8} $$

Forward dynamics error: Measures the error of a trajectory $q(t, \lambda)$ generated
by an actuation $\Phi(t)\lambda$, in relation to the task,

$$ \mathrm{err}_F = \sqrt{\| q(T, \lambda) - q_T \|^2 + \| \dot{q}(T, \lambda) - \dot{q}_T \|^2}. \tag{9} $$

Replacing $q(t, \lambda)$, $q_T$ and $\dot{q}_T$ with their corresponding end-effector values
provides the forward dynamics error of the end-effector.
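
The three measures can be computed with a few lines of NumPy, as sketched below. The interpolation and forward dynamics errors share the same endpoint form, so a single helper covers both; the projection error is taken here as the Euclidean distance to the span of the discretized synergies. Function names and the numerical example are hypothetical.

```python
import numpy as np

def wrap_angle(x):
    """Map angle differences to the interval (-pi, pi]."""
    return np.pi - np.mod(np.pi - np.asarray(x, dtype=float), 2.0 * np.pi)

def endpoint_error(q_end, dq_end, q_target, dq_target=None):
    """Root of the summed squared final position and velocity errors."""
    q_end, dq_end = np.asarray(q_end, dtype=float), np.asarray(dq_end, dtype=float)
    if dq_target is None:
        dq_target = np.zeros_like(dq_end)          # motionless final posture
    pos = wrap_angle(np.asarray(q_target, dtype=float) - q_end)
    vel = np.asarray(dq_target, dtype=float) - dq_end
    return float(np.sqrt(pos @ pos + vel @ vel))

def projection_error(Phi, u):
    """Euclidean distance between u and the linear span of the columns of Phi."""
    b = np.linalg.pinv(Phi) @ u
    return float(np.linalg.norm(u - Phi @ b))

# Hypothetical 2-joint example: the first joint is off by a full turn (no angular error).
err = endpoint_error(q_end=[0.1 + 2.0 * np.pi, 0.0], dq_end=[0.3, 0.4], q_target=[0.1, 0.0])
# u has one component outside span(Phi), of length 2.
err_p = projection_error(np.array([[1.0], [0.0]]), np.array([3.0, 2.0]))
```

In the first example only the residual velocity contributes, because the angular difference of a full turn wraps to zero.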

2.2 Synthesis and Development of Synergies 

The synthesis of synergies is carried out in two phases: exploration and reduction.
The exploration phase consists in actuating the system with an extensive set of
motor signals $\Phi_0$ in order to obtain the corresponding DRs $\Theta_0$. The reduction
phase consists in solving a small number of point-to-point reaching tasks in
kinematic space (that we call proto-tasks) by creating the interpolants using the
elements of the set $\Theta_0$, as described in Eq. (4). These solutions are then taken as
the elements of the reduced set $\Theta$. Finally, the synergy set $\Phi$ is computed using
relation (2), i.e. inverse dynamics. As a result, there will be as many synergies
as the number of proto-tasks (i.e. $N_\phi = N_\theta$). The intuition behind this
reduction is that the synergies that solve the proto-tasks may capture essential
features both of the task and of the dynamics of the system. Despite the
nonlinearities of $\mathcal{D}$, linear combinations of these synergies might be useful to solve
point-to-point reaching tasks that are similar (in terms of Eq. (3)) to the
proto-tasks (see Sec. 3).

The number of proto-tasks as well as their specific instances determine the 
quality of the synergy-based controller. To obtain good performance in a wide 
variety of point-to-point reaching tasks, the proto-tasks should cover relevant 
regions of the state space (see Sec. 3). Clearly, the higher the number of differ- 
ent proto-tasks, the more regions that can be reached with good performance. 
However, a large number of proto-tasks (and the corresponding synergies) in- 
creases the dimensionality of the controller. In order to tackle this trade-off, we 
propose a procedure that parsimoniously adds a new proto-task only when and 
where it is needed: if the performance in a new reaching task is not satisfactory, 
we add a new proto-task in one of the regions with highest projection error or 
we modify existing ones. 
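
The incremental addition of proto-tasks described above can be sketched as a greedy loop: evaluate the projection error on a set of candidate targets, and turn the worst-served candidate into a new proto-task until the error is acceptable or a size limit is reached. Everything specific in the example (the function names, the tolerance, and the toy map `toy_actuation` from a target to the actuation that solves it) is an assumption made to keep the sketch runnable; modifying existing proto-tasks is not covered.

```python
import numpy as np

def projection_error(synergies, u):
    """Distance between u and the span of the current synergies (least squares)."""
    A = np.column_stack(synergies)
    coef, *_ = np.linalg.lstsq(A, u, rcond=None)
    return float(np.linalg.norm(u - A @ coef))

def develop_synergies(candidates, initial, required_actuation, tol, max_synergies):
    """Add the worst-served candidate as a new proto-task until the error is acceptable."""
    proto_tasks = list(initial)
    synergies = [required_actuation(p) for p in proto_tasks]
    while len(synergies) < max_synergies:
        errors = [projection_error(synergies, required_actuation(c)) for c in candidates]
        worst = int(np.argmax(errors))
        if errors[worst] <= tol:                   # every candidate is reached well enough
            break
        proto_tasks.append(candidates[worst])      # new proto-task where the error peaks
        synergies.append(required_actuation(candidates[worst]))
    return proto_tasks, synergies

def toy_actuation(x):
    """Hypothetical stand-in for the actuation that solves the task with target x."""
    return np.array([1.0, x, x**2])

targets = [float(x) for x in np.linspace(-1.0, 1.0, 9)]
proto_tasks, synergies = develop_synergies(targets, [0.0], toy_actuation,
                                           tol=1e-9, max_synergies=6)
final_errors = [projection_error(synergies, toy_actuation(x)) for x in targets]
```

In this toy setting the required actuations live in a three-dimensional space, so the loop stops after three proto-tasks even though up to six are allowed.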

3 Results 

We apply the methodology described in Section 2 to a simulated planar kinematic
chain (see [10] for model details) modeling a human arm [11]. In the
exploration phase, we employ an extensive set of motor signals $\Phi_0$ to actuate
the arm model and generate the corresponding dynamic responses $\Theta_0$. The
panels in the first row of Fig. 1 show the end-effector trajectories resulting from
the exploration phase. We test two different classes of motor signals: actuations
that generate minimum jerk end-effector trajectories (100 signals), and
low-passed uniformly random signals (90 signals). In order to evaluate the
validity of the general method described in Sec. 2.1, we use the sets $\Phi_0$ and $\Theta_0$ to
solve 13 different reaching tasks without performing the reduction phase. The
second row of Fig. 1 depicts the trajectories drawn by the end-effector when the
computed mixtures of synergies are applied as actuations (i.e. forward dynamics
of the solution). Note that the nature of the solutions (as well
as that of the responses) depends on the class of actuations used. The maximum
errors are reported in Table 1. The results are highly satisfactory for both
classes of actuations, and support the validity of the proposed method. Since
the reduction phase has not been performed, the dimension of the combinator
vectors $a$ and $b$ equals the number of actuations used in the exploration.
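
For readers who want to reproduce signals of the two kinds used in the exploration, the sketch below generates a standard minimum jerk position profile and a batch of low-passed uniformly random signals. It is only indicative: in the paper the first class consists of actuations that produce minimum jerk end-effector trajectories, which additionally requires the inverse dynamics of the arm, and the filter is not specified, so the moving-average filter, the window length and the function names are assumptions.

```python
import numpy as np

def minimum_jerk(t, duration, x0, xT):
    """Minimum jerk profile from x0 to xT with zero boundary velocity and acceleration."""
    tau = np.clip(np.asarray(t, dtype=float) / duration, 0.0, 1.0)
    return x0 + (xT - x0) * (10.0 * tau**3 - 15.0 * tau**4 + 6.0 * tau**5)

def lowpass_random_signals(n_signals, n_steps, window, seed=0):
    """Uniform random signals in [-1, 1], smoothed with a moving average."""
    rng = np.random.default_rng(seed)
    raw = rng.uniform(-1.0, 1.0, size=(n_signals, n_steps))
    kernel = np.ones(window) / window              # simple FIR low-pass (assumed filter)
    return np.array([np.convolve(r, kernel, mode="same") for r in raw])

t = np.linspace(0.0, 1.0, 200)
x = minimum_jerk(t, duration=1.0, x0=0.0, xT=0.3)
signals = lowpass_random_signals(n_signals=90, n_steps=200, window=20)
```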

Table 1: Order of the maximum errors obtained by using $\Phi_0$ and $\Theta_0$ (no reduction
phase).

The objective of the reduction phase is to generate a small set of synergies
and DRs that can solve desired reaching tasks effectively. As described in Section
2.2, this is done by solving a handful of proto-tasks. The number (and the
instances) of these proto-tasks determines the quality of the controller. Figure
2 shows the projection error as a function of the number of proto-tasks. The
reduction is applied to the low-passed random signal set. Initially, two targets
are chosen randomly (top left panel); subsequent targets are then added in
the regions characterized by higher projection error. As can be seen, the
introduction of new proto-tasks leads to better performance on wider regions
of the end-effector space, and eventually the whole space can be reached with
reasonable errors. In fact, the figure shows that this procedure decreases the
average projection error to $10^{-3}$ (comparable to the performance of the whole
set $\Phi_0$, see Tab. 1) and reduces the dimension of the combinator vector to 6,
a fifteen-fold reduction. This result shows that a set of "good" synergies can
drastically reduce the dimensionality of the controller, while maintaining similar
performance. The bottom right panel of the figure shows the forward dynamics
error of the end-effector obtained with the 6 proto-tasks. Comparing this panel
with the bottom left one, it can be seen that the forward dynamics error of the
end-effector reproduces the distribution of the projection error, rendering the
latter a good estimate for task performance.

To further demonstrate that the reduction phase we propose is not trivial,
we compare the errors resulting from the set of 6 synthesized synergies with the
errors corresponding to 100 random subsets of size 6 drawn from the set of
low-passed random motor signals. Figure 3 shows this comparison. The task consists
in reaching the 13 targets in Fig. 1. The boxplots correspond to the errors of
the random subsets, and the filled circles to the errors of the synergies resulting
from the reduction phase. Observe that the order of the error of the reduced
set is, in the worst case, equal to the error of the best random subset. However, the
mean error of the reduced set is about 2 orders of magnitude lower. Therefore,
the reduction by proto-tasks can produce a parsimonious set of synergies out of
an extensive set of actuations. Evaluating the performance with different classes
of proto-tasks (e.g. catching, hitting, via-points) is postponed to future work.

4 Discussion 

The results shown in the previous section justify the interpretation of the method- 
ology as a developmental framework. Initially, the agent explores its sensory- 
motor system employing a variety of actuations. Later, it attempts to solve the 
first reaching tasks (proto-tasks), perhaps obtaining weak performance as the 
exploration phase may not have produced enough responses yet (see the
boxplots in Fig. 3). If the agent finds an acceptable solution to a proto-task, it is
used to generate a new synergy (populating the set $\Phi$); otherwise it continues
with the exploration. The failure to solve tasks of importance for its survival
could motivate the agent to include additional proto-tasks; Figure 2 illustrates
this mechanism. As it can be seen, the development of the synergy set incre- 
mentally improves the ability of the agent to perform point-to-point reaching. 
Alternatively, existing proto-tasks could be modified by means of a gradient de- 
scent or other learning algorithms. In a nutshell, the methodology we propose 
endows the agent with the ability to autonomously generate and update a set 
of synergies (and dynamic responses) that solve reaching tasks effectively. 

Despite the difficulty of the mathematical problem (i.e. nonlinear differential
operator), our method seems to generate a small set of synergies that span the
space of actuations required to solve reaching tasks. This is not a trivial result,
since these synergies outperform many other sets of synergies randomly taken
from the set $\Phi_0$ (see Fig. 3). It appears as if the reduction phase builds features
upon the exploration phase that are necessary to solve new reaching tasks. To
verify whether solving proto-tasks plays a fundamental role, our synergies could
be compared with the principal components extracted from the exploration set.
This verification goes beyond the scope of this paper.

Figure 1: Comparison of explorations with two different classes of actuation:
minimum jerk and low-passed random signal. Each panel shows the kinematic
chain in its initial posture (straight segments). The limits of the end-effector are
shown as the boundary in solid line.

An important aspect of our method is the relation between $\Theta$ and $\Phi$ (see
Eq. (2)). This mapping makes explicit use of the body parameters (embedded
in the differential operator $\mathcal{D}$); hence the synergies obtained can always be
realized as actuations. The same cannot be said, in general, for synergies identi- 
fied from numerical analyses of biomechanical data. Though some studies have 
verified the feasibility of extracted synergies as actuations [12], biomechanical 
constraints are not explicitly included in the extraction algorithms. Addition- 
ally, Eq. (2) provides an automatic way to cope with smooth variations of the 
morphology of the agent. That is, both the synergies and their dynamic re- 
sponses evolve together with the body. In line with [6, 7], these observations 
highlight the importance of the body in the hypothetical modularization of the 
CNS. 

Once the task is solved in kinematic space, the corresponding actuation can
be computed using the explicit inverse dynamical model of the system (i.e. the
differential operator $\mathcal{D}$). It might appear that there is no particular advantage in
projecting this solution onto the synergy set. However, the differential operator
might be unknown. In this case, a synergy-based controller would allow
computing the appropriate actuation by evaluating the mapping $\mathcal{M}$ on the vector
$a$, hence obtaining the synergy combinators $b$. Since $\mathcal{M}$ is a mapping between
two finite low-dimensional vector spaces, estimating this map may turn out to be
easier than estimating the differential operator $\mathcal{D}$. Furthermore, we believe that
the explicit use of $\mathcal{D}$ may harm the biological plausibility of our method. In order
to estimate the map $\mathcal{M}$, the input-output data generated during the exploration
phase (i.e. $\Phi_0$ and $\Theta_0$) could be used as a learning data-set. Further work is
required to test these ideas. Additionally, preliminary theoretical considerations
(not reported here) indicate that the synthesis of synergies without the explicit
knowledge of $\mathcal{D}$ is also feasible.

Figure 2: Selection of targets based on projection error. Each panel shows the
kinematic chain in its initial posture (straight segments). The limits of the
end-effector are the boundary of the colored regions. The color of each point
indicates the projection error produced to reach a target in that position. The
bottom right diagram shows the forward dynamics error of the end-effector using
6 proto-tasks (6 synergies). (Horizontal axes: X [m]; color scale from $10^{-1}$ to $10^{-5}$.)

Figure 3: Evaluation of the reduction phase. Errors produced by subsets
randomly selected from the exploration-actuations (boxplots) are compared with
the errors obtained after the reduction phase (filled circles).

Finally, the current formulation of the method does not include joint limits
explicitly. The interpolated trajectories are valid, i.e. they do not go beyond 
the limits, due to the lack of intricacy of the boundaries. In higher dimensions, 
especially when configuration space and end-effector are not mapped one-to-one, 
this may not be the case anymore. Nevertheless, joint limits can be included by 
reformulating the interpolation as a constrained minimization problem. Another 
solution might be the creation of proto-tasks with a tree-topology, relating our 
method to tree based path planning algorithms [13]. 

5 Conclusion and Future Work 

The current work introduces a simple framework for the generation of open loop 
controllers based on synergies. The framework is applied to a planar kinematic 
chain to solve point-to-point reaching tasks. Synergies synthesized during the 
reduction phase outperform hundreds of arbitrary choices of basic controllers
taken from the exploration motor signals. Furthermore, our results confirm 
that the introduction of new synergies increases the performance of reaching 
tasks. Overall, this shows that our method is able to generate effective syner- 
gies, greatly reducing the dimensionality of the problem, while keeping a good 
performance level. Additionally, the methodology offers a developmental in- 
terpretation of the emergence of task-related synergies that could be validated 
experimentally. 

Due to the nonlinear nature of the operator $\mathcal{D}$, the theoretical grounding of
the method poses a difficult challenge, and it is the focus of our current research. 
Another interesting line of investigation is the validation of our method against 
biological data, paving the way towards a predictive model for the hypothesis 
of muscle synergies. Similarly, the development of an automatic estimation 
process for the mapping M would further increase the biological plausibility of 
the model. 

The inclusion of joint limits into the current formulation must be prioritized. 
Solving this problem will allow testing the method on higher dimensional redundant
systems. Tree-based path planning algorithms may offer a computationally
effective way to approach the issue. 